The World Space example had a few annoyances in it. Particularly painful was the fact that, whenever the perspective projection matrix or the world-to-camera matrix changed, we had to change uniforms in 3 programs. They all used the same value; it seems strange that we should have to go through so much trouble to change these uniforms.
Also, 3 programs is a relatively simple case. When dealing with real examples, the number of programs can get quite large.
There is a way to share uniforms between programs. To do this, we use a buffer object to store uniform data, and then tell our programs to use this particular buffer object to find their uniform data. A buffer object that stores uniforms is commonly called a uniform buffer object.
It is important to understand that there is nothing special about a uniform buffer. Any of the things you could do with a regular buffer object can be done with a uniform buffer object. You can bind it to GL_ARRAY_BUFFER and use it for vertex data, you can use it for indexed rendering with GL_ELEMENT_ARRAY_BUFFER, and you can use it for many of the other things that buffer objects can be used for. Now granted, that doesn't mean that you should, only that you can.
The example World with UBO uses a uniform buffer object to store the camera and perspective matrices.
This begins with how the vertex shaders are defined.
Example 7.10. UBO-based Vertex Shader
#version 330

layout(location = 0) in vec4 position;

layout(std140) uniform GlobalMatrices
{
    mat4 cameraToClipMatrix;
    mat4 worldToCameraMatrix;
};

uniform mat4 modelToWorldMatrix;

void main()
{
    vec4 temp = modelToWorldMatrix * position;
    temp = worldToCameraMatrix * temp;
    gl_Position = cameraToClipMatrix * temp;
}
The definition of GlobalMatrices looks like a struct definition, but it is not. It defines a uniform block. A uniform block is a series of uniform definitions whose data is not stored in the program object, but instead must come from a uniform buffer.
The name GlobalMatrices is used to identify this particular uniform block. This block has two members, both of the mat4 type. The order of the members in a uniform block is very important.
Notice that nothing else needs to change in the vertex shader. The modelToWorldMatrix is unchanged, and uses of the members of the uniform block do not even need to be scoped with the GlobalMatrices name.
The “layout(std140)” part modifies the definition of the uniform block. Specifically, it specifies the uniform block layout.
Buffer objects are unformatted arrays of bytes. Therefore, something must determine how the shader interprets a uniform buffer object's contents. OpenGL itself defines this to a degree, but the layout qualifier modifies the definition.
OpenGL is very clear about how each element within a uniform block is laid out. Floating-point values are just the C++ representation of floats, so you can copy them directly from objects like glm::vec4.
Matrices are slightly trickier due to the column-major vs. row-major issue. The glUniformMatrix* functions all had a parameter that defines what order the matrix data given to the function is in. Similarly, a “layout” qualifier can specify “row_major” or “column_major”; these tell OpenGL how the matrices are stored in the buffer object. The default is “column_major,” and since GLM stores its matrices in column-major order, we can use the defaults.
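To make the column-major ordering concrete, here is a minimal sketch in plain C++ that does not depend on GLM or OpenGL. The names Mat4Storage, ColumnMajorIndex and MakeTranslation are hypothetical helpers invented for this illustration; they are not part of the tutorial's code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// A 4x4 matrix stored as 16 consecutive floats in column-major order,
// which is the default order a uniform block expects for a mat4.
// (Hypothetical type for illustration only.)
using Mat4Storage = std::array<float, 16>;

// Index of the element at (row, column) in column-major storage:
// the four floats of column 0 come first, then column 1, and so on.
inline std::size_t ColumnMajorIndex(std::size_t row, std::size_t column)
{
    assert(row < 4 && column < 4);
    return column * 4 + row;
}

// Builds an identity matrix whose fourth column holds a translation.
inline Mat4Storage MakeTranslation(float x, float y, float z)
{
    Mat4Storage m{}; // value-initialized: all zeros
    for (std::size_t i = 0; i < 4; ++i)
        m[ColumnMajorIndex(i, i)] = 1.0f;
    m[ColumnMajorIndex(0, 3)] = x;
    m[ColumnMajorIndex(1, 3)] = y;
    m[ColumnMajorIndex(2, 3)] = z;
    return m;
}
```

With this ordering, the translation components end up in floats 12, 13 and 14 of the array. Those are the bytes a column-major mat4 in a uniform block would read its fourth column from.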
What OpenGL does not directly specify is the spacing between elements in the uniform block. This allows different hardware to position elements where it is most efficient for them. Some shader hardware can place 2 vec3's directly adjacent to one another, so that they only take up 6 floats. Other hardware cannot handle that, and must pad each vec3 out to 4 floats.
Normally, this would mean that, in order to set any values into the buffer object, you would have to query the program object for the byte offsets for each element in the uniform block.
However, by using the “std140” layout, this is not necessary. The “std140” layout has an explicit layout specification set down by OpenGL itself. It is basically a kind of lowest-common-denominator among the various different kinds of graphics hardware. The upside is that it allows you to easily know what the layout is without having to query it from OpenGL. The downside is that some space-saving optimizations may not be possible on certain hardware.
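As a rough illustration of what “std140” pins down, the sketch below computes member offsets for a handful of types. It covers only floats, vectors and mat4, and it ignores arrays and structs, so treat it as a simplification of the rules in the specification. Std140Type, AlignUp and Std140Offsets are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base alignment and size, in bytes, of a few GLSL types under std140.
// (Hypothetical helper type; only a subset of the std140 rules is modeled.)
struct Std140Type
{
    std::size_t alignment;
    std::size_t size;
};

const Std140Type kFloat = {4, 4};
const Std140Type kVec2  = {8, 8};
const Std140Type kVec3  = {16, 12}; // aligned like a vec4, but holds 12 bytes
const Std140Type kVec4  = {16, 16};
const Std140Type kMat4  = {16, 64}; // four vec4 columns

// Rounds offset up to the next multiple of alignment.
inline std::size_t AlignUp(std::size_t offset, std::size_t alignment)
{
    assert(alignment != 0);
    return (offset + alignment - 1) / alignment * alignment;
}

// Computes the byte offset of each member of a block, in declaration order.
inline std::vector<std::size_t> Std140Offsets(const std::vector<Std140Type> &members)
{
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (const Std140Type &member : members)
    {
        cursor = AlignUp(cursor, member.alignment);
        offsets.push_back(cursor);
        cursor += member.size;
    }
    return offsets;
}
```

For a block of two mat4s, like GlobalMatrices, this gives offsets 0 and 64. For two vec3s it gives 0 and 16, which is the padded case described above.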
One additional feature of “std140” is that the uniform block is sharable. Normally, OpenGL allows the GLSL compiler considerable leeway to make optimizations. For instance, if a GLSL compiler detects that a uniform is unused in a program, it is allowed to mark it as unused; glGetUniformLocation will then return -1. It's actually legal to set a value to a location that is -1, but no data will actually be set.
If a uniform block is marked with the “std140” layout, then the ability to disable uniforms within that block is entirely removed. All uniforms must have storage, even if this particular program does not use them. This means that, as long as you declare the same uniforms in the same order within a block, the storage for that uniform block will have the same layout in any program. Multiple different programs can therefore use the same uniform buffer.
The other two alternatives to “std140” are “packed” and “shared”. The default, “shared,” prevents the uniform optimization, thus allowing the block's uniform buffer data to be shared among multiple programs. However, the user must still query layout information about where each uniform is stored. “packed” allows uniform optimization, so these blocks cannot be shared between programs at all.
For our needs, “std140” is sufficient. It's also a good first step in any implementation; moving to “packed” or “shared” as needed should generally be done only as an optimization. The rules for the “std140” layout are spelled out explicitly in the OpenGL Specification.
Uniforms inside a uniform block do not have individual uniform locations. After all, they do not have storage within a program object; their data comes from a buffer object.
So instead of calling glGetUniformLocation, we have a new function.
data.globalUniformBlockIndex =
glGetUniformBlockIndex(data.theProgram, "GlobalMatrices");
The function glGetUniformBlockIndex takes a program object and the name of a uniform block. It returns a uniform block index that is used to refer to this uniform block. This is similar to how a uniform location value is used to refer to a uniform, rather than directly using its string name.
Now that the programs have a uniform block, we need to create a buffer object to store our uniforms in.
Example 7.11. Uniform Buffer Creation
glGenBuffers(1, &g_GlobalMatricesUBO);
glBindBuffer(GL_UNIFORM_BUFFER, g_GlobalMatricesUBO);
glBufferData(GL_UNIFORM_BUFFER, sizeof(glm::mat4) * 2, NULL, GL_STREAM_DRAW);
glBindBuffer(GL_UNIFORM_BUFFER, 0);
For all intents and purposes, this is identical to the way we created other buffer objects. The only difference is the use of the GL_UNIFORM_BUFFER binding target.
The GL_ARRAY_BUFFER target has a specific meaning. When something is bound to that target, calling glVertexAttribPointer will cause the buffer object bound to that target to become the source for that particular attribute, as defined by the function call. The GL_ELEMENT_ARRAY_BUFFER target also has a meaning; it specifies where indices come from for indexed rendering. The element array binding is even stored as part of a VAO's data (recall that the array buffer binding is not stored in the VAO).
GL_UNIFORM_BUFFER does not really have an intrinsic meaning like these other two. Having something bound to this binding means nothing as far as any other function of OpenGL is concerned. Oh, you can call buffer object functions on it, like glBufferData as above. But it does not have any other role to play in rendering. The main reason to use it is to preserve the contents of more useful binding points. It also communicates to someone reading your code that this buffer object is going to be used to store uniform data.
This is not entirely 100% correct. OpenGL is technically allowed to infer something about your intended use of a buffer object based on what target you first use to bind it. So by allocating storage for this buffer in GL_UNIFORM_BUFFER, we are signaling something to OpenGL, which can change how it allocates storage for the buffer. However, OpenGL is not allowed to make any behavioral changes based on this. It is still legal to use a buffer allocated on the GL_UNIFORM_BUFFER target as a GL_ARRAY_BUFFER or in any other buffer object usage. It just may not be as fast as you might want.
We know that this buffer needs to be the size of two glm::mat4's. The “std140” layout guarantees this, together with the size of glm::mat4, which just so happens to correspond to how large a GLSL mat4 is when stored in a uniform buffer.
The reshape function is guaranteed to be called after our init function. That's why we can allocate this buffer without filling in a default matrix. The reshape function is as follows:
Example 7.12. UBO-based Perspective Matrix
void reshape (int w, int h)
{
    glutil::MatrixStack persMatrix;
    persMatrix.Perspective(45.0f, (w / (float)h), g_fzNear, g_fzFar);

    glBindBuffer(GL_UNIFORM_BUFFER, g_GlobalMatricesUBO);
    glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(glm::mat4), glm::value_ptr(persMatrix.Top()));
    glBindBuffer(GL_UNIFORM_BUFFER, 0);

    glViewport(0, 0, (GLsizei) w, (GLsizei) h);
    glutPostRedisplay();
}
This function just uses glBufferSubData to upload the matrix data to the buffer object. Since we defined the perspective matrix as the first matrix in our uniform block, it is uploaded to byte 0.
The display function is what uploads the world-to-camera matrix to the buffer object. It is quite similar to what it used to be:
Example 7.13. UBO-based Camera Matrix
const glm::vec3 &camPos = ResolveCamPosition();

glutil::MatrixStack camMatrix;
camMatrix.SetMatrix(CalcLookAtMatrix(camPos, g_camTarget, glm::vec3(0.0f, 1.0f, 0.0f)));

glBindBuffer(GL_UNIFORM_BUFFER, g_GlobalMatricesUBO);
glBufferSubData(GL_UNIFORM_BUFFER, sizeof(glm::mat4), sizeof(glm::mat4), glm::value_ptr(camMatrix.Top()));
glBindBuffer(GL_UNIFORM_BUFFER, 0);
The world-to-camera matrix is the second matrix, so we start the upload at the end of the previous matrix.
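One way to keep these byte offsets from drifting out of sync with the shader is to mirror the block in a C++ struct and let the compiler check the numbers. This is only a sketch: GlobalMatricesBlock and StageSubData are hypothetical names, plain float arrays stand in for glm::mat4, and StageSubData writes into a CPU-side byte array rather than calling glBufferSubData.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// CPU-side mirror of the GlobalMatrices block (hypothetical helper type).
// Each mat4 is 16 tightly packed floats, which matches std140 for this block.
struct GlobalMatricesBlock
{
    float cameraToClipMatrix[16];
    float worldToCameraMatrix[16];
};

static_assert(sizeof(float) == 4, "this sketch assumes 4-byte floats");
static_assert(offsetof(GlobalMatricesBlock, cameraToClipMatrix) == 0,
              "first matrix starts at byte 0");
static_assert(offsetof(GlobalMatricesBlock, worldToCameraMatrix) == 64,
              "second matrix starts right after the first");
static_assert(sizeof(GlobalMatricesBlock) == 128, "two mat4s, no padding");

// Stand-in for glBufferSubData that writes into a CPU-side byte array,
// so the offset arithmetic can be exercised without an OpenGL context.
inline void StageSubData(std::vector<unsigned char> &staging, std::size_t offset,
                         std::size_t size, const void *data)
{
    assert(offset + size <= staging.size());
    std::memcpy(staging.data() + offset, data, size);
}
```

In real code, offsetof(GlobalMatricesBlock, worldToCameraMatrix) could replace the hand-written sizeof(glm::mat4) offset. That only holds as long as the struct matches the block's “std140” layout, which is easy here because the block contains nothing but mat4s.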
Thus far, we have a uniform buffer object into which we store matrices. And we have programs with a uniform block that needs a uniform buffer to get its uniforms from. Now, the final step is to create the association between the uniform block in the programs and the uniform buffer object itself.
Your first thought might be that there would be a function like glUniformBuffer that takes a program, a uniform block index, and the uniform buffer to associate that block with. But this is not the case; attaching a uniform buffer to a program's block is more complicated. And this is a good thing if you think about it.
It works like this. The OpenGL context (effectively a giant struct containing each piece of data used to render) has an array of uniform buffer binding points. Buffer objects can be bound to each of these binding points. For each uniform block in a program, there is a reference, not to a buffer object, but to one of these uniform buffer binding points. This reference is just a numerical index: 0, 1, 2, etc.
A diagram should make it clearer:
The program object is given an index that represents one of the slots in the context. The uniform buffer is bound to one of those slots. Therefore, when you render with that program, the uniform buffer that is in the slot specified by the program will be where the program gets its uniform data from.
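The indirection can also be modeled in a few lines of plain C++. This is a toy model, not how any OpenGL implementation is actually written: ContextModel, ProgramModel and the three functions are hypothetical names, and the slot count of 8 is arbitrary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>

// Stands in for a buffer object name; 0 means "nothing bound".
using BufferName = unsigned int;

// Models the part of the context that holds the uniform buffer binding points.
struct ContextModel
{
    std::array<BufferName, 8> uniformBufferBindings{}; // arbitrary slot count
};

// Models a program: each uniform block index refers to a binding point,
// not to a buffer object.
struct ProgramModel
{
    std::map<unsigned int, std::size_t> blockToBindingPoint;
};

// Plays the role of pointing a program's block at a binding point.
inline void SetBlockBinding(ProgramModel &program, unsigned int blockIndex,
                            std::size_t bindingPoint)
{
    program.blockToBindingPoint[blockIndex] = bindingPoint;
}

// Plays the role of binding a buffer object to a binding point in the context.
inline void BindBufferToPoint(ContextModel &context, std::size_t bindingPoint,
                              BufferName buffer)
{
    assert(bindingPoint < context.uniformBufferBindings.size());
    context.uniformBufferBindings[bindingPoint] = buffer;
}

// Finds which buffer a block would read from at render time.
inline BufferName ResolveBuffer(const ContextModel &context, const ProgramModel &program,
                                unsigned int blockIndex)
{
    const std::size_t bindingPoint = program.blockToBindingPoint.at(blockIndex);
    assert(bindingPoint < context.uniformBufferBindings.size());
    return context.uniformBufferBindings[bindingPoint];
}
```

If two ProgramModel objects both point their block at slot 0, a single BindBufferToPoint call on slot 0 changes what both of them resolve to. Neither program has to be touched.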
Therefore, to use a uniform buffer, one first must tell the program object which binding point in the context to find the buffer. This association is made with the glUniformBlockBinding function.
glUniformBlockBinding(data.theProgram, data.globalUniformBlockIndex, g_iGlobalMatricesBindingIndex);
The first parameter is the program, the second is the uniform block index queried before. The third is the uniform buffer binding point that this block should use.
After doing this for each program, the uniform buffer must be bound to that binding point. This is done with a new function, glBindBufferRange.
glBindBufferRange(GL_UNIFORM_BUFFER, g_iGlobalMatricesBindingIndex, g_GlobalMatricesUBO, 0, sizeof(glm::mat4) * 2);
This functions similarly to glBindBuffer; in addition to binding the buffer to the GL_UNIFORM_BUFFER target, it also binds the buffer to the given uniform buffer binding point. Lastly, it provides an offset and range, the last two parameters. This allows you to put uniform data in arbitrary places in a buffer object. You could have the uniform data for several uniform blocks in several programs all in one buffer object. The offset and range parameters say where that block's data begins and how big it is.
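One detail worth knowing if you do pack several blocks into one buffer: the offset given to glBindBufferRange for a uniform buffer must be a multiple of the implementation's GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, which is queried with glGetIntegerv. The sketch below computes such offsets on the CPU. BlockRange and PackBlocks are hypothetical names, and the alignment is passed in as a parameter because the real value depends on the hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Where one uniform block's data lives inside a shared buffer object.
// (Hypothetical helper type for illustration.)
struct BlockRange
{
    std::size_t offset; // would be the offset argument of glBindBufferRange
    std::size_t size;   // would be the size argument of glBindBufferRange
};

// Lays blocks out one after another, rounding each starting offset up to
// a multiple of offsetAlignment. In a real program that value would come
// from querying GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT.
inline std::vector<BlockRange> PackBlocks(const std::vector<std::size_t> &blockSizes,
                                          std::size_t offsetAlignment)
{
    assert(offsetAlignment != 0);
    std::vector<BlockRange> ranges;
    std::size_t cursor = 0;
    for (std::size_t size : blockSizes)
    {
        cursor = (cursor + offsetAlignment - 1) / offsetAlignment * offsetAlignment;
        ranges.push_back(BlockRange{cursor, size});
        cursor += size;
    }
    return ranges;
}
```

For example, with an assumed alignment of 256 bytes, blocks of 128, 48 and 16 bytes would start at offsets 0, 256 and 512. The example in this tutorial uses only one block starting at offset 0, so it never runs into this.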
The reason this is better than directly binding a buffer object to the program object can be seen in exactly where all of this happens. Both of these functions are called as part of initialization code. glUniformBlockBinding is called right after creating the program, and similarly glBindBufferRange is called immediately after creating the buffer object. Neither one needs to ever be changed. Yes, we change the contents of the buffer object. But where it is bound never changes.
The global constant g_iGlobalMatricesBindingIndex is, as the name suggests, global. By convention, all programs get their buffer data from this index. Because of this convention, if we wanted to use a different buffer, we would not have to update every program that needs to use that buffer. Sure, for one or two programs, that would be a simple operation. But real applications can have hundreds of programs. Being able to establish this kind of convention makes using uniform buffer objects much easier than if they were directly associated with programs.
In the World Space example, we drew the camera's look-at target directly in camera space, bypassing the world-to-camera matrix. Doing that with uniform buffers would be harder, since we would have to set the uniform buffer value twice in the same frame. This is not particularly difficult, but it could be a drain on performance.
Instead, we just use the camera's target position to compute a model-to-world matrix that always positions the object at the target point.
Example 7.14. Viewing Point with UBO
glDisable(GL_DEPTH_TEST);

glutil::PushStack push(modelMatrix);

modelMatrix.Translate(g_camTarget);
modelMatrix.Scale(1.0f, 1.0f, 1.0f);

glUseProgram(ObjectColor.theProgram);
glUniformMatrix4fv(ObjectColor.modelToWorldMatrixUnif, 1, GL_FALSE, glm::value_ptr(modelMatrix.Top()));
g_pCubeColorMesh->Render();
glUseProgram(0);
glEnable(GL_DEPTH_TEST);
We do not get the neat effect of having the object always face the camera though. We still shut off the depth test, so that we can always see the object.